1.1 Scanner: From String to Tokens

In compiler theory, a source string must first pass through a tokenizer, which extracts the basic elements (Tokens) of the code. For Dart, this work is handled by the scanner module of the _fe_analyzer_shared package.

scanner/scanner.dart is the entry point of this module. It offers two functions, scan and scanString, that convert source text into a sequence of Tokens. The difference is that the former processes UTF-8 bytes, while the latter handles ordinary Dart strings.
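As a rough illustration of the entry point, the sketch below calls scanString and prints each token. It assumes the _fe_analyzer_shared package is available as a dependency and that the import path and the tokens, isEof, next, offset and lexeme members match your SDK version. The package is internal to the Dart SDK, so these details may differ between releases.

```dart
// Assumption: this import path and these members match the installed
// version of _fe_analyzer_shared, which is an internal SDK package.
import 'package:_fe_analyzer_shared/src/scanner/scanner.dart';

void main() {
  // scanString takes an ordinary Dart string and returns a result object
  // whose `tokens` field is the first Token of the stream.
  final result = scanString('var answer = 42;');

  // Follow the `next` links until the end-of-file token is reached.
  var token = result.tokens;
  while (!token.isEof) {
    print('${token.offset}: ${token.lexeme}');
    token = token.next!;
  }
}
```

Only scanString is shown here because the byte-based scan function may have extra requirements on its input buffer, depending on the SDK version.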

Internally, both functions rely on a family of Scanner classes: AbstractScanner is the base class, and StringScanner and Utf8BytesScanner derive from it. The core of these classes is the tokenize method, which performs the actual conversion.

The tokenize method contains a loop that walks through the input. Inside the loop, a large set of layered rules examines each character together with the characters that follow it, and from this decides the token type, for example whether a word is a keyword.
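To make the idea of a character-driven loop concrete, here is a deliberately tiny sketch. It is not the SDK's tokenize method: the function names, the keyword set and the "kind:lexeme" output format are all hypothetical and chosen only for illustration.

```dart
// Hypothetical toy scanner; it only mirrors the general shape of a
// tokenize loop and is far simpler than the SDK's AbstractScanner.
const keywords = {'var', 'final', 'if', 'else', 'return'};

bool isDigit(int c) => c >= 0x30 && c <= 0x39; // '0'..'9'

bool isIdentStart(int c) =>
    (c >= 0x41 && c <= 0x5A) || // 'A'..'Z'
    (c >= 0x61 && c <= 0x7A) || // 'a'..'z'
    c == 0x5F || // '_'
    c == 0x24; // '$'

bool isIdentPart(int c) => isIdentStart(c) || isDigit(c);

/// Returns one "kind:lexeme" string per token found in [source].
List<String> tokenize(String source) {
  final out = <String>[];
  var i = 0;
  while (i < source.length) {
    final c = source.codeUnitAt(i);
    if (c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D) {
      i++; // Whitespace separates tokens but produces none.
    } else if (isIdentStart(c)) {
      // Consume the whole word, then decide: keyword or identifier.
      final start = i;
      while (i < source.length && isIdentPart(source.codeUnitAt(i))) {
        i++;
      }
      final lexeme = source.substring(start, i);
      final kind = keywords.contains(lexeme) ? 'keyword' : 'identifier';
      out.add('$kind:$lexeme');
    } else if (isDigit(c)) {
      // Consume a run of digits as an integer literal.
      final start = i;
      while (i < source.length && isDigit(source.codeUnitAt(i))) {
        i++;
      }
      out.add('int:${source.substring(start, i)}');
    } else {
      // Everything else is treated as a one-character symbol.
      out.add('symbol:${source[i]}');
      i++;
    }
  }
  return out;
}

void main() {
  // prints: [keyword:var, identifier:answer, symbol:=, int:42, symbol:;]
  print(tokenize('var answer = 42;'));
}
```

The real scanner handles many more cases (strings, comments, multi-character operators, error recovery), but the structure is the same: look at the current character, pick a rule, consume as many characters as that rule allows, and emit a token.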

The result is exposed as a single object of type Token rather than as a List. Each Token holds a reference to the next one, so the tokens form a linked list that runs until the end of the input.
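The linked-list shape can be sketched with a small stand-in class. SimpleToken, link and lexemesOf are hypothetical names invented for this example; the SDK's Token class carries much more information (type, offset, comments and so on).

```dart
/// Hypothetical stand-in for the SDK's Token: a lexeme plus a `next` link.
class SimpleToken {
  final String lexeme;
  SimpleToken? next;

  SimpleToken(this.lexeme);

  /// In this sketch, an empty lexeme marks the end-of-file token.
  bool get isEof => lexeme.isEmpty;
}

/// Chains [lexemes] into a linked list terminated by an EOF token and
/// returns the first token of the chain.
SimpleToken link(List<String> lexemes) {
  SimpleToken head = SimpleToken(''); // Start from the EOF token.
  for (final lexeme in lexemes.reversed) {
    final token = SimpleToken(lexeme);
    token.next = head;
    head = token;
  }
  return head;
}

/// Walks the chain from [head] and collects every lexeme before EOF.
List<String> lexemesOf(SimpleToken head) {
  final out = <String>[];
  SimpleToken? token = head;
  while (token != null && !token.isEof) {
    out.add(token.lexeme);
    token = token.next;
  }
  return out;
}

void main() {
  final head = link(['var', 'answer', '=', '42', ';']);
  print(lexemesOf(head).join(' ')); // prints: var answer = 42 ;
}
```

A consumer such as a parser only needs the head of the chain: it can move forward by following next and stop when it reaches the end-of-file token.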

Tokenization is the first level of processing applied to the source code. The Tokens are then passed to the Parser module for the second level of processing.

Note: How many types of Tokens are there in the Dart language? Refer to "Dart SDK Token Inheritance Relationship".


Author: Maeiee


Copyright notice: Unless otherwise stated, this article is original work. Copyright belongs to Maeiee; do not reproduce without permission.

